Enhancing sample-based scheduler with collaborate-state in big data cluster
Abstract
Sample-based scheduler design has become an emerging research topic because of its high scalability and simple scheduling process in today's big data clusters. One major limitation of such designs is the lack of global cluster knowledge, which leads to sub-optimal decisions. Some cutting-edge schedulers solve this issue by deploying an extra centralized component in the cluster to capture the real-time cluster state and inform all schedulers. However, such a solution has high cost and low scalability. As an alternative, we introduce the Collaborated-Cluster-State (CCS) technique in this paper. CCS is a low-cost solution that barely harms the scalability of the sample-based design while achieving a performance gain similar to that of ECC. Experiments with Google and Yahoo production traces both show that CCS can keep up with ECC's performance under most scenarios while reducing communications by 87.7% (Google trace) and 73.9% (Yahoo trace).

1 Background and Introduction

Sample-based scheduling is currently one of the most promising branches of distributed scheduling. It is both fully distributed and low-latency. Cutting-edge sample-based schedulers currently adopt the batch-probing sampling process proposed by Sparrow. The batch-probing sampling process contains the following key steps:

1. To schedule a job of t tasks, a scheduler randomly samples 2t nodes from the cluster (the sampling ratio of 2 follows the power-of-two law [1], but could be changed to another value).
2. The scheduler sends each selected node a probe request.
3. Each probe is then queued in the worker node to serve as a task reservation.
4. When a task reservation reaches the head of the queue and is ready to execute, the worker replies to the scheduler and gets a task to run.
5. Once the scheduler has sent out all t tasks, it cancels all remaining reservations for the job by notifying the corresponding workers.
6. The worker notifies the scheduler when the assigned task completes.
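The six steps above can be sketched in a few lines of Python. This is a simplified illustration, not Sparrow's implementation: the function name `schedule_job` and the `backlog` list (the time until each worker's queue drains) are assumptions introduced here, and the late-binding behaviour of steps 4 and 5 is approximated by sorting the probed workers by backlog.

```python
import random


def schedule_job(backlog, t, ratio=2, rng=None):
    """Simplified batch probing for a job of t tasks.

    backlog[w] is a hypothetical stand-in for how long worker w needs
    before a new reservation reaches the head of its queue.
    Returns (workers that receive a task, workers whose reservation is cancelled).
    """
    rng = rng or random.Random()

    # Steps 1-3: sample ratio * t distinct workers and place one reservation on each.
    num_probes = min(ratio * t, len(backlog))
    probed = rng.sample(range(len(backlog)), num_probes)

    # Step 4: reservations become ready in order of increasing backlog,
    # so the first t workers to reply are the t least-loaded probed workers.
    ready_order = sorted(probed, key=lambda w: backlog[w])
    placed = ready_order[:t]

    # Step 5: every remaining reservation for this job is cancelled.
    cancelled = ready_order[t:]
    return placed, cancelled


if __name__ == "__main__":
    backlog = [5, 0, 3, 9, 1, 7, 2, 8, 4, 6]
    placed, cancelled = schedule_job(backlog, t=2, rng=random.Random(0))
    print("tasks placed on workers:", placed)
    print("reservations cancelled on workers:", cancelled)
```

Note that the scheduler only ever sees the backlog of the workers it sampled, which is exactly the limitation discussed next.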
More details can be found in the Sparrow paper [2].

DOI reference number: 10.18293/SEKE2017-136

One crucial problem of these sample-based schedulers is their lack of global knowledge of the cluster status. During each decision process, a scheduler can only communicate with a very small number (e.g. 2t in the example above) of sampled worker nodes and makes its decision based on information gathered from those nodes. This limits the sample-based scheduler and leads to sub-optimal task placement decisions. An example of a sub-optimal decision made by a sample-based scheduler is shown in Figure 1. In this example, the scheduler has randomly sampled two busy worker nodes, so the task being scheduled will wait in a queue no matter which worker is chosen. However, there are six available worker nodes elsewhere in the cluster, and this queue waiting could have been avoided if the scheduler had that information. Such a decision is considered sub-optimal because better decisions exist in the cluster and should have been found. In most cases, however, the scheduler itself cannot tell that a sub-optimal decision has been made; for instance, the schedulers in Figure 1 and Figure 2 both consider themselves to be in the same state.

Some cutting-edge schedulers mitigate this problem by using a centralized software component to synchronize cluster status and push the collected global cluster knowledge to each sample-based scheduler (referred to as EXCC). This centralized component can be either a shared-state master, which is specifically designed to synchronize task and resource status with all worker nodes, or an independent centralized scheduler, which is responsible for scheduling and for informing the sample-based schedulers at the same time. Either way, the centralized component has to synchronize with all worker nodes to capture the real-time status of the cluster and then inform all the sample-based schedulers accordingly.
Although the EXCC solution has proven to reduce sub-optimal decisions and improve scheduling precision effectively, it is very expensive to implement. Since the centralized component is required to synchronize with all workers in the cluster in order to collect global cluster information, the component itself becomes a potential system bottleneck, and avoiding such a bottleneck is the primary goal of sample-based schedulers. Moreover, a large amount of extra communication is needed to capture the real-time status of the cluster. In this paper, we propose collaborated-cluster-state (CCS), an alternative design that is low-cost and scalable, and that allows the scheduler to acquire global knowledge while keeping the simplicity of the sample-based scheduling design.

Figure 1: A sub-optimal task placement decision.

Figure 2: A good task placement decision.

2 Collaborated-Cluster-State

In this paper, we introduce a novel technique, collaborated-cluster-state (CCS), which provides sample-based schedulers with global cluster knowledge through scheduler collaboration. CCS keeps track of the status of all occupied and available resources by receiving updates from schedulers only. Each scheduler packs task start and completion information together with the necessary task details and sends it to CCS periodically. CCS also periodically pushes its current accumulated knowledge to all schedulers. This pushed copy of CCS knowledge advises each scheduler on which workers should be probed and which should not.

In CCS, cluster resource information is stored as an array of size N, where N is the number of worker nodes in the cluster. Each element of the array caches the resource status of the corresponding worker node as a tuple $(\vec{r}_a, \vec{r}_o)$, where $\vec{r}_a$ denotes the available resource quantity in a worker node and $\vec{r}_o$ denotes the occupied resource quantity in a worker node.
For instance, CCS[10] = (3, 4, 500, 5, 12, 200) means that worker node 10 in the cluster has 3 cores, 4 GB of DRAM and 500 GB of Flash available, and 5 cores, 12 GB of DRAM and 200 GB of Flash occupied.

The overall design for maintaining and using CCS knowledge in sample-based scheduling is shown in Figure 3. The CCS component needs to be initialized before any scheduler starts to function. Upon initialization, the CCS component reads worker registration information from the cluster daemon to acquire the maximum resource capacity of each worker node and marks all of it as available.

Figure 3: Overview of sample-based scheduling with CCS. [Diagram: cluster, CCS and scheduler; schedulers send task start/task complete messages to CCS, and CCS sends CCS copies to schedulers.]

After initialization, the CCS component first waits for incoming messages from the schedulers. When a message arrives, CCS unpacks it to recover the task start or completion information. Each piece of information is represented as one tuple, and CCS processes these tuples one by one. For each tuple, the component reads the following information: whether the tuple represents a task start or a task completion; the worker node on which the task runs; and the amount of resources the task claims. CCS then finds that worker node in its data and updates the amounts of available and occupied resources accordingly.

CCS also pushes a copy of the cluster state knowledge to all schedulers periodically. The push interval, ω, is preset: a smaller ω makes the copy held by each scheduler more precise but requires more communication, while a larger ω has the opposite effect. Each CCS copy also has an expiration period β, which guards against failure of the CCS component. Each scheduler that receives a CCS copy keeps only the latest version. At the beginning of each scheduling process, instead of choosing sample targets completely at random, the scheduler first consults its own CCS copy.
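The bookkeeping described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the class and method names (`CCS`, `apply_updates`, `snapshot`) are hypothetical, each update is assumed to be a `(flag, worker_id, claim)` tuple with flag 1 for a task start and 0 for a task completion, and networking, the push interval ω and the expiration period β are left out.

```python
class CCS:
    """In-memory sketch of the collaborated cluster state.

    state[w] holds two lists for worker w: available and occupied
    resource quantities, one entry per resource type.
    """

    def __init__(self, capacities):
        # Initialization: every worker starts with its full capacity available.
        self.state = [
            {"available": list(cap), "occupied": [0] * len(cap)}
            for cap in capacities
        ]

    def apply_updates(self, updates):
        """Process a batch of (flag, worker_id, claim) tuples one by one."""
        for flag, worker_id, claim in updates:
            worker = self.state[worker_id]
            # A start moves resources from available to occupied;
            # a completion moves them back.
            sign = 1 if flag == 1 else -1
            for i, amount in enumerate(claim):
                worker["available"][i] -= sign * amount
                worker["occupied"][i] += sign * amount

    def snapshot(self):
        """Return an immutable copy suitable for pushing to schedulers."""
        return [
            (tuple(w["available"]), tuple(w["occupied"])) for w in self.state
        ]


if __name__ == "__main__":
    # One worker with 8 cores, 16 GB DRAM and 700 GB Flash (capacity chosen
    # here so that the result matches the CCS[10] example in the text).
    ccs = CCS([(8, 16, 700)])
    ccs.apply_updates([(1, 0, (5, 12, 200))])
    print(ccs.snapshot()[0])  # ((3, 4, 500), (5, 12, 200))
```

Returning tuples from `snapshot` keeps a pushed copy from being modified by later updates, which matches the idea that schedulers hold a possibly stale but self-consistent view.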
Suppose 2t workers need to be selected for a job of t tasks and each task claims resources $\vec{r}_c$. The scheduler searches its CCS copy for 2t workers that satisfy $\mathrm{CCS}[\mathrm{worker}].\vec{r}_a \ge \vec{r}_c$. If more than 2t qualified workers are found, the scheduler by default chooses the workers with the most available resources. If fewer than 2t qualified workers are found, the scheduler chooses the remaining targets randomly from the rest of the workers.

After the sample targets are chosen, the rest of the decision process follows the common batch-probing process. The scheduler probes the selected workers and places task reservations. When the scheduler is notified by any worker that a reservation is ready to execute, the scheduler places a task with that worker. It keeps doing so until all t tasks are placed. Once all tasks are placed, it cancels all the leftover reservations in the cluster.

Table 1: Number/distribution of jobs and tasks in the traces

                       Google (2011)   Yahoo (2011)
Number of Jobs         506.4k          24.2k
Number of Tasks        17889.7k        968.3k
% Jobs ed ≤ 1000s      89.0%           96.6%
% Jobs ed ≤ 100s       42.7%           36.4%
% Jobs ed ≤ 10s        0.0%            2.7%
% Tasks ed ≤ 1000s     69.7%           84.6%
% Tasks ed ≤ 100s      7.6%            54.0%
% Tasks ed ≤ 10s       0.0%            20.2%

When the scheduler places a task on a worker, it caches a tuple [1, workerID, rc] locally. When the scheduler is notified that a task has completed, it caches a tuple [0, workerID, rc] locally. The first variable in the tuple represents task start or completion, where 1 means started and 0 means finished. The second variable is the ID of the worker concerned, and the last variable is the amount of resources the task claims. Periodically (with a fixed interval γ), the scheduler packs all cached tuples together and sends them to CCS. Each tuple is sent only once.
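The scheduler-side logic of this section can be sketched as follows. It is an illustration under assumptions rather than the authors' code: the names `select_probe_targets` and `UpdateBuffer` are hypothetical, the CCS copy is assumed to be a list of `(available, occupied)` tuples indexed by worker ID, and because the text does not say how "most available resources" is ranked across several resource types, the sketch ranks by the plain sum of available quantities.

```python
import random


def select_probe_targets(ccs_copy, claim, t, ratio=2, rng=None):
    """Choose ratio * t workers to probe, preferring workers that fit the claim."""
    rng = rng or random.Random()
    want = min(ratio * t, len(ccs_copy))

    # Qualified workers have enough of every resource type for one task.
    qualified = [
        w for w, (available, _occupied) in enumerate(ccs_copy)
        if all(a >= c for a, c in zip(available, claim))
    ]

    if len(qualified) >= want:
        # Assumption: rank by the sum of available quantities, largest first.
        qualified.sort(key=lambda w: sum(ccs_copy[w][0]), reverse=True)
        return qualified[:want]

    # Too few qualified workers: fill the remaining slots at random.
    chosen = set(qualified)
    rest = [w for w in range(len(ccs_copy)) if w not in chosen]
    return qualified + rng.sample(rest, want - len(qualified))


class UpdateBuffer:
    """Caches start/complete tuples until the next periodic send to CCS."""

    def __init__(self):
        self._pending = []

    def task_started(self, worker_id, claim):
        self._pending.append((1, worker_id, tuple(claim)))

    def task_completed(self, worker_id, claim):
        self._pending.append((0, worker_id, tuple(claim)))

    def flush(self):
        """Return all cached tuples and clear the cache, so each is sent once."""
        batch, self._pending = self._pending, []
        return batch


if __name__ == "__main__":
    ccs_copy = [((4, 8), (0, 0)), ((1, 1), (3, 7)), ((2, 4), (2, 4)), ((0, 0), (4, 8))]
    print(select_probe_targets(ccs_copy, claim=(2, 2), t=1))  # [0, 2]
```

A caller would invoke `flush` every γ seconds and send the returned batch to CCS; the timer and the transport are omitted here.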